SUMMARY

There is no question that process automation is an absolute 
requirement of modern production facilities. Furthermore, 
countless studies have demonstrated a direct link between 
overall process performance and the performance of the
underlying control system. The first and most obvious ben- 
efit of process automation is the resulting increase in pro- 
ductivity as measured by production output per employee. 
This increase in productivity is due to both the improved 
ability to run against production constraints and the reduced 
workload on plant operators. Modern processes are designed 
with automation in mind and are generally so complex and 
interactive that it would be unfeasible to operate them manu- 
ally. Secondly, automation improves the reliability and safety 
of processing facilities. An integral part of every control 
system is its safety and interlock systems, which generally
include automated shutdown procedures that can be triggered
in an emergency. There is a synergy between
optimal control and integrated safety systems that allows 
modern plants to operate much closer to equipment limits 
without jeopardizing the safety of the employees or exposing 
the equipment to unacceptable risk of damage. And finally, 
process control significantly enhances process efficiencies, 
allowing production of goods to tighter quality specifications 
and with minimal use of energy and raw materials. These 
efficiency gains can be derived from the ability to run closer 
to an operational constraint or simply by reducing unneces- 
sary process variation. 

BACKGROUND 

Studies in the early 1990s and published industrial
experience [2,7] showed that roughly 30% of all controllers
were in manual and only 20% actually reduced variation when
in auto. Similar findings resulted from a study performed by
the author while working with a large pulp and paper manu- 
facturer. Figure 11.1 shows the control usage for 1 week from 
four large integrated pulp and paper facilities. The situation 
in this case was not as dire as that described in the earlier ref- 
erences but nevertheless 12% of their feedback control loops 
were never placed in auto. Furthermore, Figures 11.2 and 
11.3 show that of the 1556 controllers examined, 13% suf- 
fered from strong oscillations and 30% appeared to respond 
to disturbances either too aggressively or slower than desired. 

The dollar value of these performance issues is hard to 
estimate and depends greatly on the specific controller and 
process area. Nevertheless, it is reasonable to expect that the 
overall economic impact of these problems is significant. It 
is concluded from these observations that there is a serious 
need to manage control performance and a necessary first 
step is to measure control performance. 

The recognition of the poor state of control performance 
in industry has led to an explosion of technology and services 
offered in this area. These include the following: 

1. Offline data acquisition and analysis equipment: This 
equipment is connected directly to analog or digital 
devices in the field and is often supplied with proprie- 
tary data analysis and controller tuning software. This 
equipment requires significant effort to set up in the 
field but it can often collect at 100 ms sample rates or 
faster. This can be very useful when designing appro- 
priate filters or diagnosing actuator and measurement 
problems. Figure 11.4 shows typical data acquisition
gear collecting 100 ms data from a flow transmitter
and a portable valve position sensor. This type of
analysis is very detailed and precise but it is also
time-consuming and expensive. Out of necessity, it tends to
be reserved for those high-value applications that are 
showing signs of trouble. 

2. Interactive network data collection and analysis
software: These tools acquire data directly from the
control system over a network using the OPC (object
linking and embedding for process control), NetDDE,
or some other communication protocol. The OPC
data communication standard is by far the leader in
the process-control field and has been adopted by all
major software vendors. These tools are generally
limited to 1 s sample rates and can suffer from excessive
filtering and data compression applied by some of the
older control systems. Nevertheless, these tools have
virtually replaced the field data acquisition devices for
their convenience and ease of use. They can also be
very powerful analytical tools if proper care is taken
to ensure signal quality. Figure 11.5 is a screen capture
from a closed-loop testing and data acquisition software
package. In the example shown, a dither signal, or
random setpoint movement imposed by the software,
is used to excite the process and identify the open-loop
dynamics for later control analysis and tuning.

FIG. 11.1

Controller usage.

FIG. 11.2

Controller oscillations.

FIG. 11.3

Controller speed of response.

FIG. 11.4

Offline data acquisition and analysis of process signals.

3. Autonomous data collection and analysis software: 
These packages reside on the manufacturing site con- 
trol and business networks and continuously collect 
data from the control system, typically via an OPC 
connection. A standard arrangement is shown in 
Figure 11.6 although far more complex network archi- 
tectures are possible. In the scenario shown, normal 
closed-loop operating data from the site control system 
are continuously sampled via OPC and stored in the 
site historian. The site historian is generally kept on 
the control local area network (LAN) close to the con- 
trol system in order to minimize network latency and 
traffic. The analysis software in this example is hosted 
on a computer on the business LAN for easy access by 
company employees at the local site or elsewhere. The 
analysis of control performance occurs automatically 
on a pre-configured schedule. Once an analysis is trig- 
gered, the required process data are delivered by the 
site data archive, through the firewall to the analysis 
server where performance reports are generated and 
made available over the web. These systems were
initially quite limited but are growing in popularity. The
limitations imposed by automated data collection
and analysis are being outweighed by the ability to
track key performance indicators for every controller
in the facility, which often number in the thousands.

FIG. 11.5

Interactive OPC-based control data acquisition and analysis software.

FIG. 11.6

Simple network architecture.

Brisk and Blackhall (1995) reported that a best-in-class
process facility will have no more than 150 control loops for 
every process control engineer. The 15 years since have seen a 
steady increase in the level of automation in most plants while 
the technical work force continues to get smaller. It is probable 
that most plants in the chemical process industries have much 
more than 150 controllers for each process control engineer or 
control technician. It is therefore clearly impossible to provide 
frequent manual monitoring and scheduled maintenance on 
all but the most critical control assets. 

Automated control performance monitoring, process 
monitoring, and fault diagnosis are rapidly developing fields, 
which have recently become available commercially. Our 
challenge is to make the best use of our scarce maintenance 
resources by incorporating these tools into our daily work 
systems. 

DEFINITION OF CONTROL PERFORMANCE 

The concept of control performance seems deceptively obvi- 
ous at first blush but grows more complex and multifaceted 
on closer inspection. This has not been helped by the com- 
mon practice of lumping virtually every process-control 
metric under the single category of control performance. 
The view expressed here is based on the belief that perfor- 
mance is a relative term that only has meaning when put in 
the context of objectives and constraints. And only those 
metrics that directly quantify adherence to those objectives 
and constraints should be referred to as performance met- 
rics. Other metrics belong under the categories of condition 
and diagnostic metrics. It is worth noting that ultimately all 
control performance objectives and constraints are derived 
from the objectives and constraints of the processes they are 
controlling. 

Control performance monitoring should be considered 
distinct from control performance audits. Audits imply an 
invasive open and closed-loop test procedure often with 
specialized high-frequency data collection and analysis. 
Audits require a detailed understanding of the underlying 
process objectives and constraints and will generally include 
a detailed review of the control-system configuration and 
standard operating procedures. Audits can be expected to 
produce a very precise and actionable list of remediation 
activities. Monitoring on the other hand implies that one or 
more statistics are continuously compared against a histori- 
cal performance benchmark looking for a significant shift 
away from acceptable or perhaps nominal performance. It 
provides a convenient way to track performance shifts over 
time and may give diagnostic clues to the underlying causes. 
However, the techniques reviewed here cannot be reasonably 
expected to replace the tools and skills necessary for a thor- 
ough understanding of control performance. 

The following definitions clarify the distinction between 
control performance metrics, control condition metrics, and 
control diagnostic metrics: 


• Control performance metric: Any statistic or
characteristic of control that quantifies its adherence
to a fundamental control objective or constraint.
In the case of single-input-single-output (SISO) feed- 
back control (i.e., a single manipulated control element 
being used to control a single process measurement), 
the objective is to minimize control error (i.e., hold 
the process measurement close to its target value or 
setpoint) with a minimum amount of control action 
while in the presence of setpoint changes and external 
unmeasured disturbances. 

• Control condition metric: Any statistic or charac- 
teristic of control that does not measure control per- 
formance directly (as defined above) but provides 
valuable insight into other aspects of the condition 
of the controller. An example of this is the controller 
service factor, which we define as the percent of time
that a controller is available for control. This is not a 
direct measure of the fundamental control objective of 
holding the process variable close to setpoint but it is 
an essential measure of the overall state of the control 
system. 

• Control diagnostic metric: Any statistic or
characteristic of control that points directly to a specific
problem or subset of problems. An example would be
the use of the bicoherence statistic as an indicator of
actuator problems. Diagnostic metrics provide specific
clues as to the underlying cause of a reduction in
control performance or a change in a general control
condition.


CONTROL PERFORMANCE METRICS 

As defined in the prior section, control performance metrics 
are a direct measure of control objectives and constraints. In 
the case of standard SISO feedback control, these measures 
will directly quantify aspects of the controller error and con- 
trol action. A direct consequence of this is that these indices 
are only meaningful when the controller is active and able to 
influence the process. It is certainly useful to track other sta- 
tistics associated with those times that the controller is inac- 
tive (percentage of total available time that the controller is 
active, for example), but those indices should be considered 
as control condition metrics rather than control performance 
metrics. 

Control Variance as a Measure of Disturbance Response 

At one point or another every control technologist has
considered using a simple measure of variability, like standard
deviation of a process measurement, as an indication of
control performance. However, that idea is quickly disqualified
when we realize how dependent that metric is on controller
disturbances and other external issues. Any meaningful
metric must be based on a benchmark appropriate for the
specific conditions being imposed on the controller being
evaluated. Åström [1] is credited with proposing the use of
the autocorrelation plot for just that purpose. He noted that
the autocorrelation plot for a minimum variance controller
(i.e., the theoretically optimal controller that would produce
the minimum movement in the controlled variable [CV])
would drop immediately to zero out past the open-loop time
delay. This concept was later exploited by Desborough and
Harris [6] in the development of a performance index which
estimates the ratio of actual controller variation to the
minimum variance controller mentioned above. This index had
a tremendous amount of appeal to practitioners because
it could be estimated from normal closed-loop operating
data and prior knowledge (or a reasonable estimate) of the
open-loop dead time. The approach they developed uses
the impulse response plot gleaned from the normal
closed-loop data, an example of which is provided in Figure 11.7.
Note that this is not the open-loop impulse response (the CV
response to impulses in the manipulated variable [MV]) but
rather the closed-loop response to an unmeasured impulse
disturbance.

The impulse response is often expressed as a polynomial
equation in discrete time:

$$y = a(t) + f_1 a(t-1) + f_2 a(t-2) + f_3 a(t-3) + \cdots \qquad (11.1)$$

where

y represents the deviation of the CV from setpoint
{a(t), a(t-1), a(t-2), ...} represents a sequence of random
impulses

The impulse response plot shown in Figure 11.7 is the
response to the impulse sequence {1, 0, 0, ...}.

FIG. 11.7

Calculation of minimum variance from impulse response.

The minimum variance control performance index
is based on the convenient fact that the optimal minimum
variance control in the presence of an identical disturbance
signal will have the same initial impulse (because of the
delay in the system) which immediately drops to zero after
the delay period. In other words, the minimum variance
impulse response can be written as

$$y_{mv} = a(t) + f_1 a(t-1) + \cdots + f_d a(t-d) \qquad (11.2)$$

Here, y_mv represents the deviation of the CV from setpoint
in the presence of minimum variance control. This impulse
response is truncated after d + 1 terms since all subsequent
terms are zero. The variance of the minimum variance signal
y_mv is given by

$$\sigma_{mv}^2 = \left(1 + f_1^2 + f_2^2 + \cdots + f_d^2\right)\sigma_a^2 \qquad (11.3)$$

where σ_a² is the variance of the impulse sequence a(t).
Desborough and Harris [6] in 1992 proposed the following
index:

$$\eta(d) = 1 - \frac{\sigma_{mv}^2}{\sigma_y^2 + \mu_y^2} \qquad (11.4)$$

where σ_y and μ_y represent the standard deviation and mean
value of the control error y, respectively. These two terms are
used instead of just σ_y in order to appropriately penalize the
control performance score for a sustained offset (μ_y ≠ 0). This
index has been criticized for being based on an unrealistic
goal but nevertheless, it does provide a useful upper bound
on performance.
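
As a minimal worked sketch of Equations 11.2 through 11.4, the following Python computes the index from a closed-loop impulse response that is assumed to be already known. The function name, the first-order example response (f_i = 0.8^i), and the dead time of two samples are illustrative assumptions rather than values from the text; in practice the coefficients would first have to be estimated from closed-loop operating data.

```python
def minimum_variance_index(f, d, sigma_a=1.0, mu_y=0.0):
    """Return eta(d) = 1 - sigma_mv^2 / (sigma_y^2 + mu_y^2).

    f       : closed-loop impulse response coefficients, with f[0] == 1
    d       : dead time in samples (terms 0..d are kept, as in Eq. 11.2)
    sigma_a : standard deviation of the driving impulses a(t)
    mu_y    : mean control error (sustained offset)
    """
    var_mv = sigma_a ** 2 * sum(c ** 2 for c in f[: d + 1])  # Eq. 11.3
    var_y = sigma_a ** 2 * sum(c ** 2 for c in f)            # actual variance
    return 1.0 - var_mv / (var_y + mu_y ** 2)                # Eq. 11.4


# Hypothetical example: first-order closed-loop response, dead time of 2 samples.
f = [0.8 ** i for i in range(200)]
eta = minimum_variance_index(f, d=2)
print(round(eta, 4))  # 0.2621
```

With this form of the index, a value near 0 means the loop is already close to the minimum variance benchmark, and a value near 1 means most of the observed variance lies beyond the dead time and is, in principle, removable.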

Several other “variance ratio” metrics have been proposed 
since Harris’ original work. One popular variation is to asso- 
ciate a specific response to the benchmark controller. Horch 
[8] proposed comparing actual variance to the variance of a 
benchmark controller with a first-order closed-loop response. 
The user must first provide a desired response in the form of 
a closed-loop time constant or settling time. This is becoming 
an ever more reasonable metric with the growing popularity of 
internal model control-based tuning rules such as those devel- 
oped by Chien and Fruehauf [4] in 1990. 

However, it is often stated that the variance ratio indi- 
ces can be misleading. A common criticism is that two very 
different signals can produce identical variance ratios (e.g., 
an oscillatory signal and overdamped signal). In fact, this 
issue is a matter of interpretation. Care must be taken with 
all these indices to not read too much into them and not look 
for a single index as the ultimate measure of performance. 
There are many facets to performance and these will require 
a combination of many indices. 


Controller Oscillation as a Measure of 
Unnecessary Control Action 

An oscillation is a motion that repeats itself at regular
intervals. In the context of the controller’s output,
oscillations generally represent unnecessary control action and
can therefore be considered a performance metric, although
it could be argued that they should be classed as a control
condition metric. Regardless, unnecessary oscillations
represent an opportunity for improvement. Oscillations can be
measured in terms of (i) their frequency or period, (ii) their
statistical significance, and (iii) their contribution to the total
overall variation.

Several approaches for identifying oscillations exist,
such as

1. Isolating a peak frequency on the power spectrum

2. Isolating a peak frequency or decay ratio on the
autocorrelation plot

3. Identifying periodic axis crossings of time series data

4. Identifying periodic axis crossings of autocorrelation
plot

Detecting and quantifying oscillations using Fourier analysis
and the power spectrum are two methods frequently used.
The Fourier analysis uses the Fourier series (shown below) to
decompose a time series into an equivalent collection of sine
and cosine functions. The power-spectrum plot is derived
from the resulting Fourier series and represents the
contribution of each frequency to the overall signal variance. It also
represents the Fourier Transform of the autocorrelation plot,
to be discussed later. In other words, the location of each
peak in the power spectrum points to the dominant
frequencies in a signal and the area under the power-spectrum curve
tells us the relative contribution of that frequency band to the
overall level of variations:

$$f(x) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx) \qquad (11.5)$$

where

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(nx)\,dx, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(nx)\,dx$$



The power spectrum of some common wave forms can 
complicate the analysis at times. The simplest example of 
this is the power spectrum of the square wave (Figure 11.8). 
Even when the signal contains a single oscillation, the power
spectrum can show the original frequency along with a series
of harmonics. This behavior is an artifact of fitting a
non-sinusoidal wave form, like a square wave, to a
series of sines and cosines and is often cited as a criticism
of using the Fourier series to describe oscillations in
time series data.
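
The harmonic pattern described above can be reproduced with a short sketch. The Python below computes a one-sided power spectrum with a direct discrete Fourier transform (standard library only) for a synthetic square wave; the function name, the record length of 200 samples, and the 20-sample period are assumptions chosen for illustration and are unrelated to the 115 Hz signal in Figure 11.8.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided periodogram via a direct DFT; entry k is the power at k cycles per record."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(coeff) ** 2 / n ** 2)
    return spectrum


# Hypothetical square wave: period of 20 samples, so 10 full cycles in 200 samples.
square = [1.0 if (t % 20) < 10 else -1.0 for t in range(200)]
power = power_spectrum(square)
peak_bin = max(range(len(power)), key=lambda k: power[k])

print(peak_bin)                 # fundamental at 10 cycles per record
print(power[30] / power[10])    # third harmonic, roughly 12% of the fundamental power
print(power[20] / power[10])    # second (even) harmonic is essentially zero
```

A single oscillation therefore produces several spectral peaks, which is why a peak-picking routine applied to a non-sinusoidal loop signal may report more oscillations than are really present.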

This type of criticism has led to a preference for other
techniques that keep track of the regularity of axis cross- 
ings of either the raw signal or the autocorrelation signal. 
The autocorrelation function tends to be preferred over the 
raw signal since it filters out random elements of the signal 
while maintaining the form and character of the underlying 
oscillations. 

Thornhill (2003) has proposed an index which is the ratio 
of the average period between axis crossings to the standard 
deviation of the period between axis crossings of the autocor- 
relation function. She further notes that this ratio follows an 
exponential distribution that can be used to test the statis- 
tical significance of the oscillation. This index is attractive 
in that it provides a rigorous distinction between a random 
signal and an oscillatory one. Unfortunately, the use of axis
crossings to detect oscillations cannot be easily extended to
detecting multiple oscillations.

FIG. 11.8

(a) Time series plot of 115 Hz square wave and (b) power-spectrum plot of 115 Hz square wave showing peaks at original frequency and
each odd harmonic (3 × 115, 5 × 115, etc.).
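
The axis-crossing idea can be sketched as follows. This Python example computes the sample autocorrelation, locates its axis crossings by linear interpolation, and returns the plain ratio of the mean crossing interval to its standard deviation, as the index is described in the text. The function names and test signals are assumptions for illustration, and no scaling factor or significance threshold from the published method is reproduced here.

```python
import math
import random
import statistics


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag, normalized so acf[0] == 1."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    c0 = sum(v * v for v in c)
    return [sum(c[t] * c[t + k] for t in range(n - k)) / c0 for k in range(max_lag + 1)]


def crossing_regularity(acf):
    """Mean interval between ACF axis crossings divided by its standard deviation."""
    crossings = []
    for k in range(1, len(acf)):
        if acf[k - 1] * acf[k] < 0:
            # Linear interpolation gives the fractional lag of the sign change.
            crossings.append(k - 1 + acf[k - 1] / (acf[k - 1] - acf[k]))
    intervals = [b - a for a, b in zip(crossings, crossings[1:])]
    if len(intervals) < 2:
        return 0.0  # too few crossings to judge regularity
    spread = statistics.stdev(intervals)
    return math.inf if spread == 0 else statistics.mean(intervals) / spread


# Hypothetical signals: a clean 25-sample oscillation versus white noise.
random.seed(0)
sine = [math.sin(2 * math.pi * t / 25) for t in range(500)]
noise = [random.gauss(0.0, 1.0) for _ in range(500)]
r_sine = crossing_regularity(autocorrelation(sine, 100))
r_noise = crossing_regularity(autocorrelation(noise, 100))
print(r_sine > r_noise)  # True: regular crossings give a much larger ratio
```

Each interval between successive crossings corresponds to half an oscillation period, so the mean interval also yields a period estimate when the ratio is large.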

CONTROL CONDITION AND DIAGNOSTIC METRICS 

As stated above, these are important metrics that quantify the 
condition of the controller but do not relate directly to their 
performance in terms of the magnitudes of controller error 
or the resulting control action. The number of indices in this 
category is large and growing with each new researcher and 
vendor adding his or her unique spin to the existing collec- 
tion. This section will focus on the most common and useful 
measures of control condition. 

Service Factor and Effective Service Factor 

The McGraw-Hill Dictionary of Scientific and Technical 
Terms [10], 2003, defines Service Factor for a chemical or a
petroleum processing plant or its equipment as “the measure 
of the continuity of an operation, computed by dividing the 
time on-stream (actual running time) by the total elapsed 
time.” Opinions differ whether to include the time that a con- 
troller is available for control, but not actually actively con- 
trolling, in the service factor. For instance, should the service 
factor include time that a controller is in the correct mode and 
available for use while downstream controllers are not in the 
correct mode? Should time when the output is saturated be 


included? The following definitions are provided in an effort 
to add clarity to these metrics. Total clock time is factored as 
shown in Figure 11.9. 

The following definitions are presented: 

1. T is the total clock time.

2. T_o is the total time the controller’s unit was operating
and the controller was potentially required.

3. T_a is the total time the controller was available for
control; this usually just means the controller was in an
active mode (automatic or cascade mode).

4. T_c is the total time the controller had access to the
downstream control elements and was able to
influence the process; times when downstream controllers
were unavailable or selectors are not passing through
the control signal are deducted from this time.

5. T_d is the total time the controller is actually
influencing the process; times when the output signals are
saturated high or low are excluded from this time.

With the following conceptual definitions for service factor
(SF) and effective service factor (SF_E):

SF is the percentage of available unit time during which
the controller was available for control. Typically, this means
the controller was in an active mode, either automatic or
cascade.

SF_E is the percentage of available time during which the
controller was actively controlling the process. This is a subset
of the time included in the SF calculation and excludes those
times when downstream elements were unavailable or the
output was saturated at its fully opened or fully closed position.

FIG. 11.9

Factored total clock time.

According to these definitions:

$$SF = \frac{T_a}{T_o} \times 100\%, \qquad SF_E = \frac{T_d}{T_o} \times 100\% \qquad (11.6)$$


In general, a low SF may point to a problem with the control- 
ler or it may be an operator training issue. High SF combined 
with a consistently low SF_E may imply that this controller is
not needed and the overall strategy should be reviewed. 
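
The service factor definitions translate directly into a counting exercise over logged status samples. The sketch below assumes equally spaced samples and a hypothetical `Sample` record with four boolean flags; neither the record layout nor the field names come from the text or from any particular control system.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    unit_running: bool   # controller potentially required (counts toward T_o)
    active_mode: bool    # automatic or cascade (counts toward T_a)
    downstream_ok: bool  # downstream elements available (counts toward T_c)
    saturated: bool      # output at its high or low limit


def service_factors(samples):
    """Return (SF, SF_E) in percent, assuming equally spaced samples."""
    t_o = sum(s.unit_running for s in samples)
    t_a = sum(s.unit_running and s.active_mode for s in samples)
    t_d = sum(
        s.unit_running and s.active_mode and s.downstream_ok and not s.saturated
        for s in samples
    )
    if t_o == 0:
        raise ValueError("unit never operated during the interval")
    return 100.0 * t_a / t_o, 100.0 * t_d / t_o


# Hypothetical ten-sample log.
samples = (
    [Sample(False, False, False, False)] * 2  # unit down
    + [Sample(True, False, True, False)] * 2  # controller in manual
    + [Sample(True, True, False, False)] * 1  # downstream element unavailable
    + [Sample(True, True, True, True)] * 1    # output saturated
    + [Sample(True, True, True, False)] * 4   # actively controlling
)
sf, sf_e = service_factors(samples)
print(sf, sf_e)  # 75.0 50.0
```

The gap between the two numbers (25 points here) is the time the controller was nominally in service but unable to influence the process.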

Output Saturation 

A controller with its output at either the high limit or low limit 
is considered saturated. At this point, the controller is stuck 
and only able to respond to disturbance in a single direction. 
According to the definitions provided in the previous section, 
a controller in this state is considered in service but its SF_E
would be reduced. Output saturation is generally considered 
bad since it represents a reduction in degrees of freedom and 
potential loss of control. It may be caused by poor valve siz- 
ing, a process problem such as equipment fouling or blockage 
or it may be due to an inappropriate setpoint choice by the 
unit operator. However, it may also be considered optimal 
since optimization strategies tend to move processes to an 
operational limit or other constraint. Care must be taken to 
interpret the incidence of output saturation correctly. Output 
saturation percent is given by 


$$\mathrm{OPSat\%} = \frac{T_{sat}}{T_o} \times 100\% \qquad (11.7)$$

where T_sat represents the subset of time that the controller
had access to all downstream elements but had its output
at either the high or low limit (T_sat = T_c - T_d, using the
times defined in Figure 11.9).
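
A minimal sketch of the saturation calculation is shown below. It assumes equally spaced samples, an output expressed in percent, and a small tolerance band around each limit; the function name, the default limits, and the 0.5% tolerance are illustrative assumptions.

```python
def output_saturation_percent(outputs, unit_running, has_access,
                              low=0.0, high=100.0, tol=0.5):
    """Percent of unit operating time spent with the output at a limit.

    outputs      : controller output samples
    unit_running : booleans, True when the sample counts toward T_o
    has_access   : booleans, True when downstream elements were available
    tol          : assumed tolerance for treating the output as "at" a limit
    """
    t_o = sum(unit_running)
    t_sat = sum(
        1
        for op, running, access in zip(outputs, unit_running, has_access)
        if running and access and (op <= low + tol or op >= high - tol)
    )
    if t_o == 0:
        raise ValueError("unit never operated during the interval")
    return 100.0 * t_sat / t_o


# Hypothetical eight-sample log; sample 5 is at the limit but has no downstream access.
outputs = [100.0, 99.8, 60.0, 45.0, 0.2, 100.0, 50.0, 50.0]
unit_running = [True] * 8
has_access = [True] * 5 + [False] + [True] * 2
pct = output_saturation_percent(outputs, unit_running, has_access)
print(pct)  # 37.5
```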


Valve Travel 


Total valve travel is intended to represent the total linear 
movement of a sliding stem valve or angular movement of 
a rotary actuated valve. A simple formula for valve travel is

$$L_T = \sum_{i=2}^{N} \left| OP_i - OP_{i-1} \right| \qquad (11.8)$$

where

L_T is the total valve travel, and
|OP_i - OP_{i-1}| represents the absolute change in output from
one sample to the next.

This will only approximate the true valve travel when the 
sample frequency is fast relative to the rate of change of the 


output and frequency of direction changes. The general belief 
is that this movement is one of the factors responsible for 
valve wear and will reduce the useful life of a valve. 
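
The valve travel calculation amounts to summing absolute sample-to-sample output changes, as in this short sketch. The function names are assumptions, and the reversal count is an additional illustrative statistic for the direction changes mentioned above, not a metric defined in the text.

```python
def valve_travel(outputs):
    """Sum of absolute sample-to-sample output changes (same units as the output)."""
    return sum(abs(b - a) for a, b in zip(outputs, outputs[1:]))


def reversal_count(outputs):
    """Number of direction changes, ignoring samples with no movement."""
    moves = [b - a for a, b in zip(outputs, outputs[1:]) if b != a]
    return sum(1 for m0, m1 in zip(moves, moves[1:]) if m0 * m1 < 0)


# Hypothetical controller output samples in percent.
op = [50.0, 52.0, 51.0, 51.0, 55.0]
print(valve_travel(op))    # 7.0
print(reversal_count(op))  # 2
```

Because the sum is taken over the controller output rather than a measured stem position, it estimates demanded travel; a sticking valve may move less than this figure suggests.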

Alarm Count and Operator Intervention Count 

A fundamental concern for safety motivates production facil- 
ities to identify and correct those controllers that demand a 
great deal of attention from plant operators. Alarm
management standards such as EEMUA 191 (Engineering
Equipment and Materials Users Association) and ISA 18.2 set
strict limits on the number of alarms an operator should be 
subjected to in a given period of time so as not to dimin- 
ish their ability to react to truly critical situations when they 
arise. Similarly, controllers that require constant direct inter- 
vention by the operator have a negative impact on process 
reliability and safety. Of course, plants must also investigate 
controllers with excessive alarms and operator interventions 
because they are probably being caused by underlying con- 
trol problems or the need for an improved control strategy. 

Control diagnostic metrics are a subset of control condi- 
tion metrics. This category, however, is reserved for those 
metrics that point to a specific underlying cause or subset 
of causes. The following are several diagnostic metrics men- 
tioned in academic literature and included in several com- 
mercial control performance monitoring packages. 

Signal Nonlinearity 

A nonlinear signal is a time series that cannot be produced by 
passing a random signal through a linear filter. Fortunately, 
most real process signals can be made approximately linear 
by virtue of the fact that they are generally controlled to a 
single operating point. Even traditionally nonlinear processes 
(e.g., pH control) can have essentially linear measurement sig- 
nals around a single operating point. It has been well known 
for some time that vibration signals from gearboxes and rotat- 
ing equipment can begin to show signs of nonlinearity when 
the equipment is damaged [13]. Detection of this nonlinearity 
is often done by monitoring higher moment statistics (such 
as the “bispectrum” or “bicoherence”) that are usually neg- 
ligible for linear signals but become more prominent as the 
signal becomes more nonlinear. Choudhury [5] was the first 
to apply this to the detection of valve problems. The strategy 
uses a higher order extension of the concept of power spec- 
trum. The power spectrum shows the power or variance of a 
signal at each frequency. The bispectrum includes a term for 
interaction between two frequencies and bicoherence is a nor- 
malized version of the bispectrum. An interaction between 
frequencies has been related to nonlinearities in the signal. 
Specifically, linear non-Gaussian (skewed) signals will have
a constant bicoherence at all frequencies. Choudhury’s
nonlinearity index (NLI) measures the statistical significance of
the difference between the maximum and mean bicoherence
looking for signs of a significant peak. A large bicoherence
peak indicates severe signal nonlinearity, which in turn is a
strong indication of actuator problems. The examples shown
in Figure 11.10 illustrate this. The first three examples show
a normal control error trend, one oscillating due to
aggressive tuning, and one oscillating due to an oscillatory external
disturbance, respectively. These three cases show negligible
bicoherence. The fourth example is oscillating due to the
severe nonlinearity caused by valve stiction and shows
significant peaks in the bicoherence plot.

FIG. 11.10

Linear and nonlinear oscillating controller error (time trends of the error signal).
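
To make the bicoherence idea concrete, the sketch below estimates the squared bicoherence at a single frequency pair by averaging over data segments. It is only an illustration under stated assumptions: the function names are hypothetical, the test signals are synthetic cosines whose third component is either phase-coupled to the first two (standing in for a nonlinear interaction) or given an independent random phase, and the NLI significance test is not implemented.

```python
import cmath
import math
import random


def dft_bin(x, k):
    """Single DFT coefficient of x at bin k."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))


def squared_bicoherence(segments, k1, k2):
    """Squared bicoherence at bins (k1, k2), averaged over equal-length segments."""
    num = 0j
    den_pair = 0.0
    den_sum = 0.0
    for seg in segments:
        x1, x2, x3 = dft_bin(seg, k1), dft_bin(seg, k2), dft_bin(seg, k1 + k2)
        num += x1 * x2 * x3.conjugate()   # interaction between the two frequencies
        den_pair += abs(x1 * x2) ** 2
        den_sum += abs(x3) ** 2
    return abs(num) ** 2 / (den_pair * den_sum)


def make_segments(coupled, count=32, n=64, k1=5, k2=9, seed=1):
    """Synthetic segments with three cosines at bins k1, k2, and k1 + k2."""
    rng = random.Random(seed)
    segments = []
    for _ in range(count):
        p1 = rng.uniform(0, 2 * math.pi)
        p2 = rng.uniform(0, 2 * math.pi)
        p3 = p1 + p2 if coupled else rng.uniform(0, 2 * math.pi)
        segments.append([
            math.cos(2 * math.pi * k1 * t / n + p1)
            + math.cos(2 * math.pi * k2 * t / n + p2)
            + math.cos(2 * math.pi * (k1 + k2) * t / n + p3)
            for t in range(n)
        ])
    return segments


b_coupled = squared_bicoherence(make_segments(True), 5, 9)
b_random = squared_bicoherence(make_segments(False), 5, 9)
print(b_coupled > 0.99, b_random < 0.5)  # True True
```

Both test signals have identical power spectra, so the power spectrum alone cannot separate them; only the phase relationship captured by the bicoherence does.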

Valve Stiction Metric

Stiction is an informal process-control term referring to 
static friction in valves. Excessive static friction causes a 
valve to resist movement and then jump quickly to a new 
location when the load provided by the actuator is enough to 
overcome the friction. This behavior typically results in the 
sort of limit cycle or stick-slip cycle depicted in Figure 11.11. 

Identification of valve stiction is a complex topic which 
often requires a combination of statistical techniques and 
pattern recognition. Several techniques assume a specific 


functional form for the inherent nonlinearity of a valve and 
then attempt to observe or identify the magnitude of stiction 
using standard system identification techniques. These tech- 
niques suffer from the true diversity of problems found with 
valves, which often do not lend themselves well to simplified 
model forms. Another technique uses the cross-correlation
between the controller output and the process variable. Under
limiting assumptions the cross-correlation function for an
oscillatory controller will be an “even” function, symmetrical
with respect to the vertical axis (i.e., f(x) = f(-x)), if the
oscillation is caused by an external disturbance or poor tuning, and
an “odd” function, antisymmetric about the origin
(i.e., f(x) = -f(-x)), if the oscillation is due to a severe
nonlinearity in the loop. This is based on the limiting assumptions
that the loop is truly oscillating, the controller has significant
integral action, and the process being controlled is neither
integrating nor handling a compressible fluid. Once again, the limiting
assumptions severely restrict the usefulness of this algorithm.
A more successful but complex algorithm uses the
oscillation and nonlinearity metrics based on high-order statistics
to identify the presence of a severe nonlinearity in the control
loop. The presence and magnitude of valve nonlinearity is
then quantified by fitting the PV-vs.-OP plot to a
characteristic ellipse and measuring the width as shown in Figure 11.12.

FIG. 11.11

Valve limit cycle.

Error due to Stiction 

It is perhaps more meaningful to quantify stiction based on 
its contribution to variability of the controlled signal. One 
approach is to filter the controller error signal (SP-PV) with 
a band-pass filter corresponding to the frequencies identified 
from the bispectrum analysis discussed in the signal nonlin- 
earity section. The error due to stiction is then calculated as 
the standard deviation of this filtered controller error signal.
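
A minimal sketch of this calculation follows. It assumes an ideal frequency-domain band-pass (only the selected DFT bins are kept) instead of a designed time-domain filter, and it assumes the band is supplied by an earlier analysis step; the function name, the synthetic error signal, and the chosen bins are all illustrative.

```python
import cmath
import math


def band_limited_std(error, k_low, k_high):
    """Standard deviation of the error within DFT bins k_low..k_high (ideal band-pass).

    Requires 0 < k_low <= k_high < len(error) / 2.
    """
    n = len(error)
    power = 0.0
    for k in range(k_low, k_high + 1):
        coeff = sum(error[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power += 2.0 * abs(coeff) ** 2 / n ** 2  # bin k plus its mirror image n - k
    return math.sqrt(power)


# Hypothetical controller error: a limit cycle at bin 8 (amplitude 2) plus an
# unrelated faster component at bin 40 (amplitude 1).
n = 256
error = [
    2.0 * math.sin(2 * math.pi * 8 * t / n) + math.sin(2 * math.pi * 40 * t / n)
    for t in range(n)
]
sigma_stiction = band_limited_std(error, 6, 10)
print(round(sigma_stiction, 4))  # 1.4142
```

The result equals the amplitude of the isolated component divided by the square root of two, so the faster component outside the band does not inflate the estimate.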

Setpoint Activity 

Setpoint movement or activity is an important factor in 
feedback control performance. Linear feedback control of a 
linear process can be decomposed into the sensitivity and 
complementary sensitivity transfer functions. The dynamic
characteristics of these represent the classic trade-off for
control tuning: tune for optimal disturbance response vs.
tune for optimal setpoint response. Multi-tiered control
strategies work by passing variance down to the lower lay- 
ers in the form of setpoint changes. No control monitoring 
system would be complete without a measure of setpoint 
activity, of which there are several. The simplest metric is 
the standard deviation or variance of the setpoint. A more 
powerful measure is the cross-correlation or cross-spectral 
density between the setpoint and the controller error [14]. 
These measures provide information on the peak correlation 
as a function of signal lag or frequency. The cross-spectral
density is the Fourier transform of the cross-covariance, and
spectral coherence is its normalized form.
Several indices can be developed from these core calcula- 
tions and concepts. One such index is the setpoint activity 
index based on the coherence between the control error sig- 
nal (SP-PV) and each of the individual inputs; PV and SP. If 
the index is defined as 


$$\text{Setpoint activity} = \bar{\gamma}_{e,SP} - \bar{\gamma}_{e,PV} \qquad (11.9)$$

where $\bar{\gamma}_{e,SP}$ and $\bar{\gamma}_{e,PV}$ are the mean coherences between the
error signal and the setpoint and process variable, respectively.
Mean coherence will vary between 0 for no correlation
between signals and 1 for complete correlation. Therefore,
the setpoint activity as defined above will vary from -1 for
the limiting case of a constant setpoint to +1 for the opposite
limiting case of all control error being determined by setpoint
moves.

FIG. 11.12

PV vs. OP ellipse for severely nonlinear actuators: (a) original data, (b) original data overlaid with best-fit ellipse; the length of segment A
is reported as observed stiction. (Taken from Choudhury, M.A.A.S. et al., Detection and quantification of control valve stiction. With
permission.)

Setpoint activity must be interpreted carefully. In gen- 
eral, if the loop under review is the inner loop of a cascade 
structure or the MV in a supervisory control scheme, then we 
should expect a low and preferably negative setpoint activity. 
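
The index in Equation 11.9 can be sketched as follows. The coherence estimate is an assumption made for illustration: it averages unwindowed, non-overlapping segments, and it treats a signal with no power at a frequency as having zero coherence. The segment length and the `eps` threshold are hypothetical tuning values.

```python
import cmath
import math
import random

def _dft(x):
    """One-sided DFT of a short segment (bins 0..N/2)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n // 2 + 1)]

def mean_coherence(x, y, seg_len=32, eps=1e-12):
    """Mean magnitude-squared coherence averaged over non-overlapping segments."""
    n_bins = seg_len // 2 + 1
    pxx, pyy, pxy = [0.0] * n_bins, [0.0] * n_bins, [0j] * n_bins
    for start in range(0, len(x) - seg_len + 1, seg_len):
        xs, ys = x[start:start + seg_len], y[start:start + seg_len]
        mx, my = sum(xs) / seg_len, sum(ys) / seg_len
        fx = _dft([v - mx for v in xs])
        fy = _dft([v - my for v in ys])
        for m in range(n_bins):
            pxx[m] += abs(fx[m]) ** 2
            pyy[m] += abs(fy[m]) ** 2
            pxy[m] += fx[m] * fy[m].conjugate()
    coh = []
    for m in range(1, n_bins):  # skip the DC bin, which is zero after mean removal
        if pxx[m] <= eps or pyy[m] <= eps:
            coh.append(0.0)     # assumption: no power means no correlation
        else:
            coh.append(abs(pxy[m]) ** 2 / (pxx[m] * pyy[m]))
    return sum(coh) / len(coh)

def setpoint_activity(sp, pv):
    """Mean coherence of (error, SP) minus mean coherence of (error, PV)."""
    error = [s - p for s, p in zip(sp, pv)]
    return mean_coherence(error, sp) - mean_coherence(error, pv)

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(256)]
constant = [5.0] * 256
constant_sp_case = setpoint_activity(constant, noise)   # setpoint never moves
moving_sp_case = setpoint_activity(noise, constant)     # error driven by setpoint only
```

The two synthetic cases reproduce the limiting values of -1 and +1 described above. Several segments are needed because a coherence estimate from a single segment is identically 1.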

MODEL PREDICTIVE CONTROL PERFORMANCE MONITORING 

Model predictive control (MPC) has become synonymous 
with advanced process control (APC) in the chemical pro- 
cess industries and much of the research around control 
performance monitoring has shifted toward multivariable 
controller monitoring in recent years. A detailed descrip- 
tion of MPC is beyond the scope of this chapter but the 
interested reader can refer to any of the growing number of 
books and websites dedicated to this subject. The key ele- 
ments of MPC are: 

1. A dynamic model of the open-loop multivariable pro- 
cess predicts the future value of process outputs based 
on current and past moves made to the controller’s 
MVs 

2. An objective function that combines the various elements
of the control solution: predicted gap between
the CVs and their targets weighted by their relative
importance, proposed changes to the MVs and their
associated movement penalties, and other terms depending
on the MPC technology chosen

3. Constraint limits on the CVs and MVs 

The MPC uses an optimizer to find the optimum set of control
movements that minimizes the objective function while
remaining within the CV and MV constraint
limits. MPC is sometimes referred to as receding horizon
control since only the first step of the control solution is 
implemented and then the entire optimization calculation 
is repeated on the next iteration. The variables involved in 
prediction and control are classified as CVs, MVs, and dis- 
turbance variables (DVs). CVs with only constraint limits are 
occasionally considered separately as limit variables. Every 
commercial MPC algorithm follows these general steps: 

1. Compare the current predicted CV values to the actual 
measured values and calculate the offset. Update the 
prediction bias based on this most recent offset. 

2. Calculate an optimal MV trajectory based on the 
objective function and the MV and CV constraints. 

3. Implement the first step of the set of optimal control 
moves and wait for the next control iteration. 

4. Return to step 1. 
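
The four steps can be sketched for a deliberately tiny case. Everything below is an illustrative assumption rather than any commercial algorithm: a single MV and CV, a first-order model, one move held over the horizon (so the optimum has a closed form), MV limits enforced by clamping, and no CV limits.

```python
def mpc_step(x_model, y_meas, u_prev, target, *, gain, pole,
             horizon, move_penalty, u_min, u_max):
    """One receding-horizon iteration for a first-order single-input model."""
    # Step 1: compare prediction with measurement and update the bias.
    bias = y_meas - x_model
    # Step 2: find the single move du that minimizes the sum of squared
    # predicted gaps plus move_penalty * du**2.
    num, den = 0.0, move_penalty
    for i in range(1, horizon + 1):
        s_i = gain * (1 - pole ** i)                      # step-response coefficient
        free_i = pole ** i * x_model + s_i * u_prev + bias  # prediction with no move
        num += s_i * (target - free_i)
        den += s_i ** 2
    du = num / den
    # Step 3: implement only this first move, inside the MV limits.
    u = min(max(u_prev + du, u_min), u_max)
    return u, bias

def simulate(target, steps=100, true_gain=1.2):
    """Closed loop with a plant whose gain differs from the model gain."""
    gain, pole = 1.0, 0.8
    x_model = y_plant = u = 0.0
    history = []
    for _ in range(steps):  # Step 4: repeat at every control interval
        u, _bias = mpc_step(x_model, y_plant, u, target, gain=gain, pole=pole,
                            horizon=10, move_penalty=1.0, u_min=0.0, u_max=2.0)
        x_model = pole * x_model + gain * (1 - pole) * u
        y_plant = pole * y_plant + true_gain * (1 - pole) * u
        history.append((u, y_plant))
    return history
```

In this sketch the bias update is what removes the steady-state offset caused by the gain mismatch, and a target that cannot be reached leaves the MV pinned at its high limit.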

MPC CONTROL PERFORMANCE METRICS 

There has been an understandable tendency to repeat the 
history of regulatory control performance monitoring by 
creating a multivariate minimum variance index that can 
be calculated using only normal operating data. That is an 
appealing but significantly more complicated approach 
in a multivariate environment. The reason for this is the
complexity of describing dead time in a multivariable system.
The multivariate equivalent to dead time is the interactor
matrix. Other techniques for estimating minimum
variance without prior knowledge of the interactor have been 
proposed along with strategies that use moving horizon for- 
mulations for minimum variance control. Linear quadratic 
Gaussian control has also been proposed as a benchmark. 
None of these strategies has been implemented in commer- 
cial monitoring systems because of their complexity and 
need for extensive knowledge of the entire process model. 

The challenge of coming up with a useful multivariate 
variance-based metric in practice has led to a tendency toward 
customized calculations for MPC control performance. These 
calculations draw from the original objectives of the controller
(plant throughput, quality, energy, and chemical consumption)
as the basis for performance. One general observation that
holds true for most model predictive controllers in the chemical 
process industry is that the optimum is found at an intersection 
of constraints. This is shown schematically in Figure 11.13. 

MPC CONDITION METRICS 

The ability to explicitly manage constraints is arguably the 
primary reason MPC has enjoyed such widespread accep- 
tance in the chemical process industries. One of the primary 
ways that this control strategy meets its performance objec- 
tives is to hold the process as close as possible to an optimal 
set of process constraints or limits without exceeding them. 

CV Limit Activation 

CV limit activations are generally reported as the percentage
of active controller time that the CV steady-state target was
at its high or low limits within a small allowable tolerance.
This situation is displayed graphically in Figure 11.14.

This graph shows the steady-state CV target starting
between the high and low limits, then moving to the low limit
for time $T_{AL}$, then violating the low limit for time $T_V$, and
then ending at the high limit for time $T_{AH}$. A CV limit is
considered active if the steady-state target is within a
distance tolerance of $\delta$ from it. According to these definitions,
the low limit activation ($A_L$) and high limit activation ($A_H$)
are given by

$$A_L = \frac{T_{AL}}{T_0} \times 100\% \qquad (11.10)$$

$$A_H = \frac{T_{AH}}{T_0} \times 100\% \qquad (11.11)$$

CV Steady-State Give-Up 

The control variable constraints are not true constraints inso- 
far as the objective function is formulated in such a way that 
the optimal solution may occasionally violate CV limits. We 
use the term constraints for these limits to reflect the fact 
that they are very heavily penalized in the objective formula- 
tion and should be avoided. That being said, advanced con- 
trollers require the flexibility to “give-up” on lower priority 
constraints in order to prevent violation of higher priority 
constraints. This constraint give-up is a critical performance 
measure and should be monitored closely. Using the variables
shown in Figure 11.14, steady-state limit give-up ($V_{ss}$)
is given by

$$V_{ss} = \frac{T_V}{T_0} \times 100\% \qquad (11.12)$$
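
The three percentages can be computed from a logged steady-state target as in the sketch below. It assumes equally spaced samples taken while the controller was active, constant limits, and that a target lying beyond a limit by more than the tolerance counts as give-up rather than activation; the function name and arguments are hypothetical.

```python
def cv_limit_metrics(ss_target, low, high, tol):
    """Return (low activation, high activation, give-up) as percentages of samples."""
    n = len(ss_target)
    at_low = at_high = give_up = 0
    for target in ss_target:
        if target < low - tol or target > high + tol:
            give_up += 1          # steady-state target violates a limit
        elif abs(target - low) <= tol:
            at_low += 1           # target held at the low limit
        elif abs(target - high) <= tol:
            at_high += 1          # target held at the high limit
    return tuple(100.0 * count / n for count in (at_low, at_high, give_up))

# Ten samples: between limits, at the low limit, below it, then at the high limit.
trend = [5, 5, 2, 2, 2, 1, 1, 8, 8, 8]
a_low, a_high, v_ss = cv_limit_metrics(trend, low=2, high=8, tol=0.1)
```

Counting samples is equivalent to the time ratios in the equations only when the sampling interval is constant.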



FIG. 11.14

Time trend of CV limit activation and steady-state give-up.

FIG. 11.15

Steady-state give-up vs. violation: (a) the highlighted region shows a period of steady-state give-up where the optimal solution (indicated
by the dark line) requires violation of the CV low limit, (b) the highlighted region shows a period of constraint violation where the optimizer
has predicted that the CV would lie on the upper constraint but the actual measured response exceeded the constraint.


CV Limit Violation 

There is a subtle but important distinction between a con- 
straint limit give-up and a limit violation. As described 
above, a steady-state give-up is part of an optimal control 
solution in the future, which may or may not actually happen. 
The term violation is reserved for the time the CV actually 
spends outside of the configured limits. Refer to Figure 11.15 
for a comparison. In the top figure, the actual limit violation 
is consistent with the planned limit give-up. In the lower fig- 
ure, there was a period of time when the CV unexpectedly 
violated the limit. These violations can be tracked using stan- 
dard statistics such as the average limit violation and the peak 
limit violation. 
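
A minimal sketch of these statistics follows. Whether the average is taken over all samples or only over the violating samples is not stated here; the sketch averages over violating samples only, which is an assumption.

```python
def limit_violation_stats(cv, low, high):
    """Return (percent of samples in violation, average violation, peak violation)."""
    # Distance outside the limits for each measured CV sample (0 when inside).
    excursions = [max(low - value, value - high, 0.0) for value in cv]
    violating = [e for e in excursions if e > 0.0]
    if not violating:
        return 0.0, 0.0, 0.0
    percent_time = 100.0 * len(violating) / len(cv)
    average = sum(violating) / len(violating)
    return percent_time, average, max(violating)

stats = limit_violation_stats([5.0, 8.5, 9.0, 1.5, 5.0], low=2.0, high=8.0)
```

These values are calculated from the measured CV, not from the steady-state target used for the give-up metric.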

CV Prediction Quality 

The quality of CV predictions is one of the most critical 
aspects of MPC. The ability of the MPC to predict future val- 
ues of the CVs lies at the heart of managing the constraints 
in an optimal fashion. A poor predictor will force the control 
engineer to set the CV limits well within the true constraints. 
At best this will produce a suboptimal control solution
and at worst it will perform worse than manual control and
eventually be turned off. The unbiased CV predictions can
be studied using spectral and time series analysis to identify
and diagnose model quality issues. The errors of these unbiased
CV predictions are one-step-ahead forecast errors, or prediction
residuals, as defined by Box and Jenkins [3]. Their textbook
dedicates an entire chapter to “Model Diagnostic Checking” 
and describes several techniques involving the model residu- 
als. The first of these uses the autocorrelation of the residu- 
als. By definition, a nonzero autocorrelation means that a 
portion of the signal can be predicted using prior measure- 
ments. Therefore, it follows that a good prediction will be 
characterized by a residual with little or no autocorrelation as
is shown in Figure 11.16. Here, the autocorrelation function
starts at 1 (since a signal is perfectly correlated with itself)
but then immediately drops to insignificant values.

FIG. 11.16

Autocorrelation of prediction error.
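
A simple check along these lines is sketched below. It flags lags whose sample autocorrelation exceeds the common approximate 95% band of ±2/√N for white noise; the choice of band and of the maximum lag are assumptions, not settings from any particular monitoring package.

```python
import math

def autocorrelation(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[k] * d[k + lag] for k in range(n - lag)) / c0
            for lag in range(max_lag + 1)]

def significant_lags(residuals, max_lag=20):
    """Lags (excluding 0) whose autocorrelation lies outside +/- 2/sqrt(N)."""
    bound = 2.0 / math.sqrt(len(residuals))
    acf = autocorrelation(residuals, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(acf[lag]) > bound]

# A residual that alternates in sign is highly predictable from its last value.
alternating = [1.0 if k % 2 == 0 else -1.0 for k in range(100)]
flagged = significant_lags(alternating)
```

With a well-behaved predictor only about one lag in twenty should be flagged by chance, so an occasional isolated flag is not by itself evidence of a poor model.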

Figure 11.17 shows the power spectrum for the actual 
measured CV (often referred to as a “process variable” or 
PV) and the unbiased prediction. A mismatch in the low-frequency
end of the spectral plot is an indicator of an improper
model gain. A mismatch in the middle-to-high frequency 
region of the plot is an indication of potential problems with 
the model’s time constants or time delays. 

Box and Jenkins [3] go on to describe ways to use cross-cor- 
relation between the model inputs and residuals to identify the 
potential source of model error. This technique can be applied 
to MPC model diagnostics by calculating the peak cross-cor- 
relation value between the prediction residual and the MV and 
DV. Table 11.1 shows an example of a diagnostic table from a
commercial MPC monitoring package. It shows a sorted list of
the variables with the largest correlation with model error. The
process gains taken from the MPC models are provided along
with the peak correlation values for convenience. In general,
we can expect that the modeled process with the largest true
gain would be the most sensitive to model errors although this
is only one factor and should not be overstated.

FIG. 11.17

Spectral analysis of CV and unbiased prediction.

TABLE 11.1

Example Showing the Use of the Cross-Correlation of Prediction Residual to MV
for Model Error Diagnosis

MV/DV          Correlation   Gain
FIC-2001.SP    -0.61         -0.28
TIC-2003.SP    -0.23          0.15
FIC-2002.SP     0.21          0.46
FIC-2004.SP     0.07         -0.27

MV/DV correlation with pred. err.
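
A sorted list of this kind could be produced as in the sketch below. The tag names, the synthetic signals, and the restriction to non-negative lags (inputs leading the residual) are all assumptions for illustration.

```python
import math
import random

def peak_cross_correlation(x, resid, max_lag):
    """Return (peak value, lag) of the normalized cross-correlation for lags 0..max_lag."""
    n = len(x)
    mean_x, mean_r = sum(x) / n, sum(resid) / n
    dx = [v - mean_x for v in x]
    dr = [v - mean_r for v in resid]
    scale = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dr))
    best = (0.0, 0)
    for lag in range(max_lag + 1):
        r = sum(dx[k] * dr[k + lag] for k in range(n - lag)) / scale
        if abs(r) > abs(best[0]):
            best = (r, lag)
    return best

def rank_inputs(inputs, resid, max_lag=10):
    """Sort inputs by the magnitude of their peak correlation with the residual."""
    rows = [(name, *peak_cross_correlation(signal, resid, max_lag))
            for name, signal in inputs.items()]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)

random.seed(1)
mv_a = [random.gauss(0.0, 1.0) for _ in range(300)]
mv_b = [random.gauss(0.0, 1.0) for _ in range(300)]
# Synthetic residual driven entirely by MV_A with a two-sample delay.
residual = [0.0, 0.0] + [-0.5 * v for v in mv_a[:-2]]
ranking = rank_inputs({"MV_A": mv_a, "MV_B": mv_b}, residual)
```

The lag at which the peak occurs is reported alongside the value because it can hint at whether the model error is in the gain or in the dynamics.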

MV Limit Activation 

In most cases, an active MV Limit is not considered desir- 
able since it represents a loss in degrees of freedom (i.e., the 
controller is less capable of responding to disturbances or 
moving the CVs to their limits). There are exceptions, how- 
ever, where the optimal solution incorporates running an 
MV at its constraint. Consider, for example, the case where 
the loads on several steam powered turbines are the MVs 
used to control pressure on a steam header. The optimal 
solution will adjust the load on the least efficient turbine to 
respond to changes in steam production or demand while 
keeping the most efficient turbine running at maximum load 
or its high MV limit. In either case, MV limit activation is 
a valuable metric. The definitions of these limit activation
metrics are equivalent to the corresponding CV metrics. The
MVs do not require limit violation metrics since the model 
predictive controller will never move the MVs outside their 
specified limits. 

BUSINESS PERFORMANCE 

An ongoing challenge for process control is to link control 
performance (or the loss in control performance) as defined 
earlier to economic or business performance. These do 
not necessarily need to be defined in monetary terms but 
they should relate directly to the core business objectives 
of the process area being monitored. This relationship is 
necessary to justify the initial cost of purchasing and com- 
missioning a control performance monitoring system, to 
justify the ongoing effort to maintain the system and inte- 
grate its findings into the site maintenance work practices, 
and finally to prioritize the daunting number of problems 
that these systems will identify. Conceptually, a business 
performance metric can either measure the realized benefit 
relative to a performance baseline or measure the unreal- 
ized opportunity relative to a performance target. The real- 
ized benefit metrics are useful for showing the value of past 
investments and the need for continued maintenance; the 
unrealized opportunity metrics show the opportunity for 
further gains. 

A partial list of the business-level objectives for process
control is given below:

• Stabilize unstable open-loop processes. 

• Make it possible to operate modern processes which 
are far too complex to be operated manually. 

• Increase productivity as measured by output per oper- 
ating employee. 

• Improve process reliability and overall equipment 
effectiveness. 

• Improve process safety by removing distractions. 

• Reduce or eliminate off-specification product. 

• Reduce or eliminate process emissions and effluent. 

• Minimize energy and raw material usage. 

• Operate against optimal constraints. 

Defining and quantifying the business value of control perfor- 
mance is tantamount to defining and quantifying the business 
value of process control itself. There are numerous published 
anecdotes describing the benefits of control performance and 
several focused studies. An often quoted study led by Marlin 
et al. [9] in 1991 proposed a methodology for estimating the 
benefit of process-control technology and demonstrated the 
approach with seven industrial case studies. The research- 
ers were able to identify potential benefits in the range of 
1.4%-6% of operating costs. Brief accounts of the proposed 
business benefits are shown in Table 11.2 along with a brief 
description of the associated control improvement. The key
observation was that the proposed control improvements are
generic and lend themselves well to a generic monitoring
package that tracks variability, the ability to track a setpoint,
the ability to reject disturbances, etc. However, the business
benefits are not generic and do not lend themselves well to
generic monitoring strategies.

TABLE 11.2

Business Benefits of Process Control from Industrial Case Studies

Industrial Case Study   % Yield      % Energy   Production   Staffing      Deferred    Other a
                        Increase a   Reduction  Increase a   Reduction a   Capital a
Sugar refinery          5            62                      25                        8
Petroleum refinery      87           13
Power plant                          100
Oxo plant               2            1          97
Sewage treatment plant               9                                     90          1
Alumina refinery                     100
Vinyl chloride plant    44           4          30                                     22

a All entries as percentages of base case benefits.

FIG. 11.18

Objective monitoring hierarchy. (Diagram labels: control performance metrics; control condition and
diagnostic metrics; safety; ability to hold setpoint; control constraints; oscillation; service factor;
nonlinearity; alarm count; disturbance response; standard statistics (mean, variance).)

Several vendors of control performance monitoring software
include an economic metric to prioritize control problems.
On closer examination, these economic measures reveal
themselves to be a combination of a control performance metric
and a user-specified economic weighting factor. The user is
expected to supply the weighting factor, which can be a daunting
task in plants with many thousands of controllers, and the
validity of such a broad-brush approach is questionable.

Objective-Monitoring Hierarchy 

A preferred approach is to create a structured objective-mon- 
itoring hierarchy based on engineering knowledge as shown 
in Figure 11.18. 

Business-level objective metrics are found at the top 
level. Those factors known to directly influence the top-level 
metrics are collected in the next level. At a minimum, this 
will include the controller performance metrics for those 
controllers that directly influence the top-level objectives. 
This level may also hold other process metrics known to be
key influencers of the top-level metrics. Additional tiers can
be placed below the second level, although excessive complexity
will undermine the usefulness and maintainability of
this monitoring strategy. Controllers are then weighted based
on their overall level of influence on the top-level business
performance metrics. This can be determined analytically 
through simulation, based on process knowledge or through 
multivariate statistical analysis of historical data. 

Alternative weighting may be based on the criticality of
the loop’s performance to the overall safety and integrity of
the unit.
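
One way such weights might be used to prioritize loops is sketched below. The combination rule (influence weight multiplied by performance shortfall), the 0-to-1 score scale, and the tag names are all assumptions; the weights themselves would come from simulation, process knowledge, or statistical analysis as described above.

```python
def prioritize_loops(scores, weights):
    """Rank loops by influence weight times performance shortfall (hypothetical rule)."""
    # scores: loop tag -> performance index in [0, 1], where 1 is ideal
    # weights: loop tag -> relative influence on the top-level business metric
    priority = {tag: weights[tag] * (1.0 - scores[tag]) for tag in scores}
    return sorted(priority, key=priority.get, reverse=True)

scores = {"FIC-101": 0.9, "TIC-202": 0.5, "LIC-303": 0.2}
weights = {"FIC-101": 0.6, "TIC-202": 0.3, "LIC-303": 0.1}
order = prioritize_loops(scores, weights)
```

In this example the worst-performing loop is not ranked first because its influence on the business objective is small.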

CONTROL PERFORMANCE MONITORING AND 
CONDITION-BASED MAINTENANCE 

A study was conducted in 2003 by McNabb [11] to quantify 
the accuracy of commercially available control performance 
monitoring and diagnosis technology. Thirty-two controllers 
were manually audited to document all performance issues 
and their causes. Next, four vendors were invited to collect 
several weeks of normal plant operating data and use that data
to identify performance issues and diagnose the root cause. 
Table 11.3 summarizes the results from the selected vendor. 

Each vendor was allowed to manually review their findings
and submit a written report along with any diagnostic
suggestions they may have. It was found that the chosen
vendor’s expert, using their software analysis tools, was able
to identify all the loops with significant performance issues.
Furthermore, their best guess at the root cause was helpful
roughly three out of four times. Their software was able to
correctly identify poor performance without manual expert
intervention 9 out of 10 times. Persistent oscillation and
significant stiction were correctly identified 75% and 83% of the
time, respectively. The decision was made to purchase and
install control performance monitoring software and proceed
with developing a work system that takes advantage of the
strengths of the technology while avoiding the weaknesses.
It is a fair statement that the effort to evaluate and select a
control performance monitoring solution is just a preamble
to the real work: integrating control performance monitoring
into the maintenance work processes.

TABLE 11.3

Agreement between Automated Performance
Assessment and Manual Audit

Performance Metric                                            Score
Detailed assessments                  Overall                 100%
                                      Diagnostic comments     74%
Automatic assessment (no user analysis)                       90%
Oscillation detection                                         76%
Stiction detection                    % Identified            83%
                                      False positives         2

Score = percent of loops in agreement.


Control performance monitoring software is a passive 
technology and cannot add value without being integrated 
into the daily work processes. The first step is to identify 
those individuals who currently prioritize and execute con- 
trol loop maintenance work. Ideally, the maintenance pro- 
cess is consistent throughout the site and has been formally 
documented but in many cases the process is an informal 
cooperation between an instrumentation supervisor and 
someone from the production department. Nevertheless, the 
first step is to understand who does the work and how daily 
maintenance priorities are set. Once the key players and their 
existing work processes have been identified, the processes 
can be modified to take advantage of the condition monitor- 
ing software. The technology is quite reliable at identifying 
problem loops but it is less precise at diagnosing the spe- 
cific issue. Furthermore, follow-up activity will depend on 
the specific nature of the performance problem and the root 
cause. 

Several alternative work processes have been attempted 
with varying levels of success. The various approaches are 
listed in Table 11.4 along with their relative advantages and 
disadvantages. Although the ideal situation is for active 
involvement and oversight by the production organization, 
the reality is that they may need solid evidence at their 
site that this program can add value and increase their unit 
performance. It is probably more realistic to begin with an 
individual-driven program or auditing study to quickly dem- 
onstrate value before moving to a longer term sustainable 
solution. 


TABLE 11.4

Alternate Approaches for Working with Control Performance Monitoring Software

Approach: Individual champion
  Advantages: Least complicated approach. Probable short-term benefit.
  Disadvantages: Hard to sustain benefits. Program stops when champion moves on.
    Dependent on skills/credibility of single champion.
  Recommendation: Use this as a short-term approach to demonstrate the concept.

Approach: Targeted control performance audits
  Advantages: Very focused approach. Virtually assures demonstrable benefits.
    Way to incorporate tools into larger capital and optimization projects.
  Disadvantages: Hard to sustain benefits. Labor intensive and expensive.
    Requires multi-week fulltime commitment from a core team.
  Recommendation: Use this approach to generate significant benefits as part of a
    larger capital project, advanced control project, or optimization effort.

Approach: E&I maintenance driven program
  Advantages: Leverages existing maintenance workflows. Can displace certain
    scheduled maintenance activities.
  Disadvantages: May be viewed as “extra” work. Reliability is more of a priority
    than performance.
  Recommendation: Works best with E&I departments already engaged in
    predictive/preventive maintenance programs.

Approach: Process-control specialist driven program
  Advantages: Leverages group with great skill set for this activity. Not difficult
    to get commitment.
  Disadvantages: Pulls process-control engineers into daily maintenance activities;
    less time available for advanced control and optimization projects.
  Recommendation: Works best with large process-control departments that are
    already involved in daily control asset maintenance activities.

Approach: Production management driven program
  Advantages: This group reaps most of the benefit of control performance
    improvements. Production department sets maintenance priorities.
  Disadvantages: Often not convinced that control performance is their most
    pressing issue. Not comfortable with process-control technology and terminology.
  Recommendation: This is the ultimate goal; start with an easier option if
    necessary but work toward production management oversight.

Regardless of the approach taken, a prerequisite for success
is a shared understanding of roles and responsibilities.
This is described in more detail in the following section.

Roles and Responsibilities 

Table 11.4 makes it clear that there are several ways to organize
a control performance management program. However,
there are a few general statements that apply to the organization
of most plants. Typically, the base layer controls are
supported by a group of technicians, engineers, operators,
and managers who tend to focus on different facets of control
performance:

• Plant manager: The plant manager is primarily con- 
cerned with the overall uptime of the process-control 
assets; these represent a significant capital and main- 
tenance cost and are not adding value if they are not 
being used. 

• Shift supervisor/production-maintenance coordina- 
tor: The shift supervisor is concerned with the amount 
of operator intervention required. Too much attention 
paid to a small number of controllers undermines the 
operator’s ability to manage operation of the process 
as a whole. One or more supervisors typically serve 
a dual role as Production-Maintenance Coordinator. 
They have the additional responsibility of working 
with the maintenance department to prioritize main- 
tenance work items. 

• Unit operator: The operator needs to trust that the 
controller is not going to get him into trouble; he is 
concerned with robustness and stability and less with 
uptime and optimization. 

• Production engineer: The production engineer must
achieve the production targets that have been set by
the planning/scheduling team and plant manager.
He is most concerned with unit-wide disturbances or
oscillations and incidents resulting in quality and production
losses.

• Process engineer: The process engineer is involved in 
the design phase of the process-control application but 
typically only gets involved with control performance 
maintenance if they suspect the problems are caused 
by, or causing problems to, the underlying process 
behavior. 

• Technical or process-control manager: The technical 
or process-control manager is concerned with applica- 
tion uptime and documented benefits. Issues requiring 
additional resources will be brought to his attention 
as necessary. 

• Process-control engineer: Virtually every control
performance concern or issue ultimately gets passed
to the area process-control engineer. His focus will be
on identifying the cause and communicating that to
key stakeholders. It is important to note that in many
cases the problem will not actually be corrected. It
may be found to be a perception issue that requires
training or clarification or it may be a problem that is
deferred because of cost constraints or other conflicts.
Regardless, it is the process-control engineer’s job to
manage these and communicate the follow-up plan to
the stakeholders.

• E&I technician: He focuses on equipment repair and 
replacement once a control performance problem has 
been identified. Some sites have senior technicians and 
specialists whose experience and expertise enables 
them to act as de facto process-control engineers. 

An example work process is shown schematically in Figure
11.19. The workflow shown is based on the concept
of control loop “dispositions.”



FIG. 11.19

Control performance management work process. (Diagram annotations: diagnostic technician has
improved training and equipment to diagnose faults flagged by software; process control
support as needed.)

TABLE 11.5

Disposition Options and Associated Responsibility

Disposition: Requires attention
  Who is responsible? Production-maintenance coordinator
  What is expected? This system has identified this loop as having problems. Review the
    performance scores and DCS trends, talk to operators, etc., then either fix an obvious
    problem or select one of the other five dispositions.

Disposition: False positive
  Who is responsible? Process control
  What is expected? The controller performance looks acceptable. Process control will correct
    simple configuration problems or contact vendor to correct more complicated issues.

Disposition: Capital request pending
  Who is responsible? Engineering
  What is expected? This problem needs re-engineering or capital expense. Issue a capital request.

Disposition: Maintenance work order pending
  Who is responsible? E&I
  What is expected? This is a control equipment or tuning problem. E&I will diagnose the
    problem and take appropriate action.

Disposition: Deferred
  Who is responsible? Production
  What is expected? There is a performance problem but the cost and effort to correct it is
    not justified at this time.

Disposition: Assess problem
  Who is responsible? Production
  What is expected? More study is required before the problem can be assigned to another group.

Disposition: Resolved
  Who is responsible? Production
  What is expected? Production should close out the follow-up item by setting the disposition
    to “Resolved” and close any associated work orders when work is complete.


The idea is that at a regular interval, perhaps weekly,
the monitoring system identifies poorly performing loops
and gives them the disposition “requires attention.” The
production-maintenance coordinator then reviews the list
and assigns each loop a follow-up disposition. These dispositions
and their associated responsibilities are summarized in 
Table 11.5. 
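
The disposition concept maps naturally onto a small data structure, sketched below. The disposition names and responsible groups follow Table 11.5; the score scale, the threshold, and the rule for re-flagging loops are hypothetical.

```python
from enum import Enum

class Disposition(Enum):
    REQUIRES_ATTENTION = "requires attention"
    FALSE_POSITIVE = "false positive"
    CAPITAL_REQUEST_PENDING = "capital request pending"
    MAINTENANCE_WORK_ORDER_PENDING = "maintenance work order pending"
    DEFERRED = "deferred"
    ASSESS_PROBLEM = "assess problem"
    RESOLVED = "resolved"

# Responsible group for each disposition, as listed in Table 11.5.
RESPONSIBLE = {
    Disposition.REQUIRES_ATTENTION: "Production-maintenance coordinator",
    Disposition.FALSE_POSITIVE: "Process control",
    Disposition.CAPITAL_REQUEST_PENDING: "Engineering",
    Disposition.MAINTENANCE_WORK_ORDER_PENDING: "E&I",
    Disposition.DEFERRED: "Production",
    Disposition.ASSESS_PROBLEM: "Production",
    Disposition.RESOLVED: "Production",
}

def weekly_review(scores, dispositions, threshold=0.5):
    """Flag poorly scoring loops that have no open follow-up; return the review list."""
    for loop, score in scores.items():
        current = dispositions.get(loop)
        # Assumption: only loops with no disposition or a closed one are re-flagged.
        if score < threshold and current in (None, Disposition.RESOLVED):
            dispositions[loop] = Disposition.REQUIRES_ATTENTION
    return [loop for loop, d in dispositions.items()
            if d is Disposition.REQUIRES_ATTENTION]

dispositions = {"TIC-202": Disposition.DEFERRED, "LIC-303": Disposition.RESOLVED}
scores = {"FIC-101": 0.3, "TIC-202": 0.2, "LIC-303": 0.4, "PIC-404": 0.9}
review_list = weekly_review(scores, dispositions)
```

Keeping deferred loops off the review list reflects the intent that the coordinator's weekly scan stays short.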

Role of the Production-Maintenance Coordinator 

In an ideal world, the control performance monitoring soft- 
ware would not only identify and diagnose problems but 
also prioritize them based on their impact on the business. 
Unfortunately, the technology has not developed to that point 
and we still require a person with knowledge of the process 
to prioritize control problems. 

One approach is to have knowledgeable individuals 
go through the entire list of controllers and weight them 
according to their importance to the overall process. 
Unfortunately this exercise is extremely time consuming 
and somewhat arbitrary. A more practical approach is to 
have the area Production-Maintenance Coordinator peri- 
odically scan the list of loops that the system has flagged 
as requiring attention and choose an appropriate follow- 
up plan. An example decision tree/workflow is shown in 
Figure 11.20. 

This should be done once a week at a minimum and 
should require less than 1 h even for a large plant. At this 
point, follow-up responsibility is passed to one of the remain- 
ing departments unless he chooses to set the disposition to 
“assess problem” pending a more detailed review of the 
problem. Note that this initial review by the production- 
maintenance coordinator is intended as a coarse filter to 
avoid chasing after false positives or working on low prior- 
ity issues. It does not require careful analysis for this initial 
screening nor does it require any particular process-control 
expertise. 


The work process described here is just one of many pos- 
sible solutions. Many other processes will succeed if they are 
developed with the following considerations: 

• The workflow must be flexible, well defined, and as 
simple as possible. 

• There must be a single individual accountable for each
stage of the process, recognizing that in most cases it
will be a different individual for each area of the plant.
Any workflow process will fail if there is uncertainty
about follow-up responsibility.

• There must be accountability. We all have more work 
to do than there are hours in a day. This process will 
fail if it is not monitored regularly by senior and mid- 
level managers. 

• Finally, a well-designed control performance
management workflow will change how you currently do
your work, but it must not add work. If
implemented correctly, it will begin to pay dividends
in terms of process performance, process reliability,
and individual productivity.

SOFTWARE REQUIREMENTS FOR ENTERPRISE-WIDE 
CONTROL PERFORMANCE MONITORING 

True control performance metrics should be a direct
reflection of the control objectives and constraints. In the
same way, software requirements should be directly linked to
the objectives and constraints of the work process that the
software supports. Because there are a number of distinct
workflow options for control performance management, there
can also be a number of distinct sets of software
requirements, depending on the planned use case. This section
focuses on the requirements of a centralized enterprise-wide
control performance monitoring solution and discusses them
in terms of the fundamental elements of a complete control
performance management solution shown in Figure 11.21. It is
not necessary for a software solution to automate every
aspect of a complete solution. However, these steps will
ultimately have to be addressed through software
functionality or through manual workflow processes. The
challenge is to create the software/workflow combination
that strikes the right balance between reliability and ease
of use.

FIG. 11.20

Control performance management decision tree and workflow.


FIG. 11.21

Fundamental elements of a complete control performance management solution.
Measure: data collection and archival. Contextualize: map raw measurements
into a logical collection or concept; put the data in context. Analyze:
distill data into key measures or diagnostic indicators; use diagnostic
logic to interpret results. Orchestrate: take action on the results; manage
follow-up activity and track results. Present: reports, graphics, dashboards.

Measure

This step is fundamentally about access to data. The door was
first opened to automated control performance management
with the arrival of the first direct digital control computer
systems in the late 1960s. The 1970s and 1980s saw an
explosion of computerized automation suppliers, each with
its own proprietary hardware, operating system, and
application software. Digital data archives (or “historians”)
began to crop up soon after the arrival of digital control.
These early data historians used custom-written interfaces to
collect digital process data, and end users were dependent on
the historian vendor for data analysis and visualization
tools. The turning point came with the formation of a
five-company consortium under the banner of “OPC Task Force”
in 1995. The official press release in October 1995 announced
that work was beginning on the OPC communication standard,
described as “a set of interfaces, properties, and methods that
extend Microsoft’s OLE (Object Linking and Embedding)
and COM (Component Object Model) technologies for use
in process control applications.” This consortium was
supported by Microsoft Corporation and was created to develop
a Microsoft Windows-based standard communication protocol
for the process-control industry. OPC has become the de
facto industry standard in the years since and has produced
a growing list of standards, including (1) OPC-DA for
real-time data access, (2) OPC-HDA for access to data stored
in digital historians, and (3) OPC-A&E for access to alarm
and event data. A more recent addition to this collection is
OPC Unified Architecture. This new standard combines
the current collection of OPC standards into a single unified
standard, supports the growing trend toward service-oriented
architecture, and creates a more technology-agnostic
approach for the future.

Enterprise-wide control performance monitoring software
requirements for this step include the following:

1. Universal connectivity: the software supports the major
communication standards, including OPC-DA, OPC-HDA,
OPC-A&E, and ODBC. The ability to import data from text
and spreadsheet files is also very convenient for offline
analysis and troubleshooting.

2. Data preprocessing: there is a somewhat arbitrary
line between “data preprocessing” and “analysis.”
Regardless, certain functions are sufficiently common to
be considered a prerequisite for analysis. These include
the following:

a. Remove non-numeric or bad values.

b. Collate and synchronize data from multiple data
sources.

3. Organize preprocessed data in a standard format for
later analysis.

4. Interactive offline interface for developing
preprocessing logic and viewing results prior to placing
the analysis online. An example interface is shown in
Figure 11.22.

FIG. 11.22

Offline interactive interface for developing and testing data preprocessing expressions.
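The two basic preprocessing functions listed above (removing
bad values, then collating and synchronizing sources) can be
sketched in a few lines of Python. This is a minimal
illustration only, not the method of any particular product:
the function names, the sample tags, and the choice to keep
only timestamps common to every source are assumptions made
for the example.

```python
import math


def clean(samples):
    """Drop samples whose value is non-numeric, NaN, or infinite."""
    cleaned = []
    for timestamp, value in samples:
        try:
            number = float(value)
        except (TypeError, ValueError):
            continue  # e.g. a status string such as "Bad Input", or None
        if math.isfinite(number):
            cleaned.append((timestamp, number))
    return cleaned


def synchronize(series_by_name):
    """Collate several cleaned series onto the timestamps they share."""
    lookups = {name: dict(series) for name, series in series_by_name.items()}
    common = set.intersection(*(set(lookup) for lookup in lookups.values()))
    return [
        {"t": t, **{name: lookups[name][t] for name in lookups}}
        for t in sorted(common)
    ]


# Hypothetical raw data from two sources (time in seconds, value).
pv_raw = [(0, "50.1"), (10, "Bad Input"), (20, 50.4), (30, float("nan")), (40, 50.2)]
op_raw = [(0, 35.0), (10, 35.5), (20, None), (40, 36.0)]

rows = synchronize({"pv": clean(pv_raw), "op": clean(op_raw)})
for row in rows:
    print(row)
```

Only the samples at 0 s and 40 s survive, because those are
the only timestamps at which both sources report a valid
number. A real system would more likely interpolate onto a
common time base than discard the unmatched samples.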

Contextualize 

This is a somewhat abstract concept, but it has widespread
implications. Contextualization is the act of putting a diverse
collection of items into a common context for later reference.
Consider the monitoring of a control system as an example.
First, consider all the pertinent information that may be
required to adequately monitor and manage the performance of
a controller:

• Control system and historian tag names which refer to 
the measurement or process variable (PV), the desired 
target or setpoint (SP), and the control action or output 
(OP) 

• Control system name which refers to the specific 
instance of the control algorithm, generally PID, 
and all of the associated configuration and tuning 
parameters 

• The equipment numbers and maintenance record of 
associated sensor and actuator equipment 

• Control system and historian tag names that provide the
equipment operating statuses necessary for analysis


• Control room alarm and event information 

• Pertinent laboratory values and operator log 
information 

This information will be dispersed throughout multiple data 
sources: 

• Process historian for storing time-series data from the 
control system 

• Alarm and event database for discrete event 
information 

• Instrument database for maintenance and calibration 
records 

• Work order or maintenance management system 

• Laboratory information management systems 

Early performance monitoring software evolved from offline
tools and was developed without serious consideration of
contextualization. As demand for system integration grew,
these systems rapidly became complex and unmanageable.
In addition, the connections between systems were generally
“hard-wired” and rigid, providing very little flexibility for
custom reporting or ad hoc analysis. Changes or additions to
these systems became awkward and expensive, which has
limited their growth and acceptance.

More modern applications make better use of the concepts
of metadata and data models. In practice, this means
that instead of creating a custom analysis for each controller,
we create an analysis for each of the common forms,
including stand-alone PID, cascade, feedforward, and gap
action. Individual control instances are then mapped to these
specific classes. This additional up-front effort to provide
context pays huge dividends down the road, with much less
effort required for modifications and additions to the
performance management solution.

FIG. 11.23

Templating interface for creating standardized calculations.

Therefore, requirements for a modern enterprise-wide
performance monitoring application include the following:

1. Support for generalized calculations and visualizations
as templates: monitoring software should provide a way
of generalizing calculations. A common way of doing
this is to create an analysis template whose inputs and
outputs are abstracted in a way that makes them
reusable. An example interface is shown in Figure 11.23.

2. Ability to create specific instances of standard
calculations and visualizations: monitoring software must
have a way of creating multiple instances of a template
and then mapping template inputs, outputs, and parameters
to actual tags and database entries. An example interface
for mapping input tags from a historian to the appropriate
analysis field is shown in Figure 11.24.
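The template and instance idea can be illustrated with a short
Python sketch. It is an assumption-laden illustration, not
the data model of any commercial package: the class names,
the "standalone_pid" template, the tag names, and the
in-memory dictionary standing in for a historian are all
hypothetical.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Callable, Dict, List, Tuple


@dataclass
class AnalysisTemplate:
    """A calculation written once against abstract input names."""
    name: str
    inputs: Tuple[str, ...]
    calculate: Callable[..., Dict[str, float]]


@dataclass
class AnalysisInstance:
    """One controller mapped onto a template through its tag names."""
    loop_id: str
    template: AnalysisTemplate
    tag_map: Dict[str, str]  # abstract input name -> historian tag

    def run(self, read_tag: Callable[[str], List[float]]) -> Dict[str, float]:
        missing = [i for i in self.template.inputs if i not in self.tag_map]
        if missing:
            raise KeyError(f"{self.loop_id}: unmapped inputs {missing}")
        data = {i: read_tag(self.tag_map[i]) for i in self.template.inputs}
        return self.template.calculate(**data)


def pid_error_stats(pv, sp):
    """Mean and standard deviation of the control error (SP - PV)."""
    errors = [s - p for p, s in zip(pv, sp)]
    return {"mean_error": sum(errors) / len(errors), "error_std": pstdev(errors)}


STANDALONE_PID = AnalysisTemplate("standalone_pid", ("pv", "sp"), pid_error_stats)

# Hypothetical historian contents, keyed by tag name.
historian = {
    "FIC101.PV": [49.0, 50.0, 51.0, 50.0],
    "FIC101.SP": [50.0, 50.0, 50.0, 50.0],
    "TIC205.PV": [120.0, 120.0, 120.0],
    "TIC205.SP": [121.0, 121.0, 121.0],
}

instances = [
    AnalysisInstance("FIC101", STANDALONE_PID, {"pv": "FIC101.PV", "sp": "FIC101.SP"}),
    AnalysisInstance("TIC205", STANDALONE_PID, {"pv": "TIC205.PV", "sp": "TIC205.SP"}),
]

results = {inst.loop_id: inst.run(historian.__getitem__) for inst in instances}
print(results)
```

Adding a third controller of the same class requires only a
new tag mapping, and a change to the calculation is made once
in the template rather than once per loop.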


Analyze 

We analyze data to distill it down into much more concise
metrics that, in theory at least, are much easier to interpret.
The analysis engine and its associated configuration interface
are the heart of any performance monitoring solution and are
generally the primary area of focus. The key characteristics
of this step are as follows:

1. Advanced data preprocessing: This goes beyond the
basic preprocessing steps of removing bad values and
collating data from various sources. It includes more
advanced operations such as

a. Interpolation: fill in missing values or create
interpolated values at predefined sample periods
for simplified time-series analysis

b. Filtering: reduce the influence of noise

c. Data substitution: apply custom logic to selectively
replace values with calculated alternatives;
a common application is to remove outliers
by replacing values outside a certain limit with
the maximum or minimum limit value

2. Library of analysis functions: The user should not
have to develop complex analysis functions by hand.
Modern packages should include such functions as

a. Standard math functions: addition, subtraction,
multiplication, division, exponential operations,
powers and roots, trigonometric functions, etc.

b. Statistical functions: average, variance, standard
deviation, probability density functions for common
distributions (Gaussian, chi-squared, t distribution,
F distribution)

c. Logic and data flow functions: conditional
expressions and Boolean logic, classifier logic,
selectors, etc.

d. Time-series and spectral analysis: impulse
response fitting, autocorrelation, Fourier analysis,
power spectrum, cross-spectrum, coherence,
bicoherence, etc.

e. Multivariate analysis: principal component
analysis, partial least squares (also known as
“projection to latent structures”), and correlation maps,
which conveniently show the correlation among many
variables (Figure 11.25)

f. Specialized analysis: disturbance response indices,
oscillation detection, valve stiction detection, etc.

g. Ability to integrate external calculations: COM
object support, MATLAB® scripts, etc.

3. High-level diagnostics: Most users will not be experts
in control performance monitoring concepts. High-level
diagnostic comments are required in order to gain
acceptance and provide value for a broader audience.
Diagnostic indices can be combined with Boolean
logic to automatically generate comments regarding
the quality of control, potential causes, and corrective
actions. This logic can also be used to generate alerts
and provide the interface between the “analysis” and
“orchestration” steps.

4. Analysis scheduling and execution: Developers and
administrators of control performance monitoring
solutions require a simple means to execute an analysis
on demand during development and troubleshooting.
Once the analysis has been developed, they require a
convenient method for scheduling automatic analysis.
Figure 11.26 shows an example interface for that purpose.

FIG. 11.24

Interface for adding context to input data tags.
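The three advanced preprocessing operations named in item 1
(interpolation, data substitution, and filtering) can be
sketched as follows. This is a minimal illustration under
stated assumptions: the function names, the 10 s sample
period, the 0 to 20 limits, and the filter constant are
invented for the example, and a first-order exponential
filter stands in for whatever noise filter a given package
provides.

```python
def resample_linear(samples, period):
    """Linearly interpolate (time, value) pairs onto a regular time grid.

    Assumes at least two samples with distinct timestamps.
    """
    samples = sorted(samples)
    resampled = []
    i = 0
    t = samples[0][0]
    last_t = samples[-1][0]
    while t <= last_t:
        # Advance to the segment [samples[i], samples[i + 1]] that contains t.
        while i < len(samples) - 2 and samples[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = samples[i], samples[i + 1]
        resampled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += period
    return resampled


def clamp(values, low, high):
    """Data substitution: replace out-of-range values with the nearest limit."""
    return [min(max(v, low), high) for v in values]


def exponential_filter(values, alpha):
    """First-order low-pass filter; a smaller alpha gives heavier smoothing."""
    filtered = []
    for v in values:
        previous = filtered[-1] if filtered else v
        filtered.append(previous + alpha * (v - previous))
    return filtered


# Hypothetical raw data: the sample at 20 s is missing and 90.0 is an outlier.
raw = [(0, 10.0), (10, 12.0), (30, 16.0), (40, 90.0)]

resampled = resample_linear(raw, period=10)
clamped = clamp([v for _, v in resampled], low=0.0, high=20.0)
smoothed = exponential_filter(clamped, alpha=0.5)

print(resampled)  # the gap at 20 s is filled with 14.0
print(clamped)    # the 90.0 outlier is replaced by the 20.0 limit
print(smoothed)
```

The order of operations is a design choice. Here the outlier
is clamped before filtering so that a single bad value does
not bleed into the following filtered samples.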

Orchestrate 

Performance monitoring is a passive activity and does not
provide value unless action is taken. A critical aspect of
enterprise-wide control performance monitoring is taking
action on the findings and integrating the software tools into
the site work practices. There is no firm dividing line between
what is done manually and what is automated. Software will
not take the place of good training and standard operating
procedures, but automation and ease of use will shorten the
learning curve and increase acceptance of the tools. The
following is a list of software features that are technically
feasible and will help ensure the overall success of a control
performance management program:
FIG. 11.25

Correlation map.

FIG. 11.26

Scheduling an automated performance analysis.

FIG. 11.27

Follow-up action triggered by email.


1. Events that trigger follow-up activity

Diagnostic logic is the interface between the “analysis”
and “orchestrate” stages of control performance
management. The logic may be as simple as a maximum
threshold on an oscillation statistic or as complex as a
combination of multiple indices and thresholds. The
ultimate result is the same: a notification is sent
indicating that one or more controllers require attention.
A simple example is given in Figure 11.27, where an
automated email is used to notify the responsible
individual of several seriously oscillating control loops. At
that point, a standard operating procedure would dictate
the next steps for further diagnosing the source of
the problem and taking appropriate action. Additional
automation can be achieved by using the email to
trigger the generation of a work order in a computerized
maintenance management system or to kick off
a more involved workflow procedure in a standard
package such as Microsoft SharePoint.

2. Mechanism for prioritizing problems

It is not uncommon for automated control performance
monitoring software to correctly identify one-third of all
controllers as requiring attention. A treemap showing
controller performance results is presented in Figure 11.28.
A modern chemical processing facility will have 1000 or more
controllers, which means that several hundred will be flagged
as requiring attention. It is not practical or even
appropriate to spend time and maintenance resources
correcting all of these problems. The problems must first be
prioritized based on their impact on the core business
objectives: safety, reliability, quality, production rate,
and costs. Unfortunately, there are no simple analytical
means for quantifying these impacts a priori. The only
methods currently in common use are

a. Responsible individuals, typically supervisors or
production-maintenance coordinators, prescreen
the list of controllers that require attention and
either defer the problem or issue a work order.

b. Loops are assigned a permanent weighting based
on their relative importance.

c. A combination of (a) and (b).

3. Mechanism for assigning follow-up responsibility and
tracking progress

A tracking process is required for at least two distinct
reasons: (1) to ensure that problems are addressed and
(2) to allow the responsible individuals to focus on new
problems as they arise. As mentioned previously,
there may easily be hundreds of problems in a large
facility, with several more cropping up daily. Most of
these problems will be deferred because they are a low
priority or because they cannot be adequately addressed
until a later date. In either case, the system needs to
track the current disposition of the problem and confirm
that appropriate follow-up action is taking place. A
minimum requirement is that the software keeps
track of the follow-up disposition or provides a link
to an electronic log, automated workflow, or maintenance
management system. Figures 11.28 and 11.29
show sample interfaces where users can review
the loops that require their attention and then defer the
problem or select an appropriate follow-up action.

FIG. 11.28

Treemap showing controller performance results.

FIG. 11.29

Interface for managing follow up.

FIG. 11.30

User interface for identifying loops that require attention.

4. Results tracking 

Every performance management program requires 
a certain amount of focus and commitment by key 
stakeholders to be successful. This focus and com- 
mitment can only be sustained if their ongoing effort 
produces demonstrable results. Often, however, the 
effort required to document benefits is as significant 
as the effort required to correct the problem in the 
first place. Any software features or workflow inte- 
gration that make it easier to quantify and report ben- 
efits will go a long way to ensure that the program 
will be sustained. 
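The first two features in this list, a diagnostic threshold
that triggers follow-up and a permanent weighting used to
prioritize, can be combined in a short Python sketch. The
0.6 oscillation limit, the loop tags, the weights, and the
disposition labels are hypothetical values chosen for the
example, and the notification is only built as text rather
than sent by email.

```python
from dataclasses import dataclass

OSCILLATION_LIMIT = 0.6  # hypothetical threshold on a 0-to-1 oscillation index


@dataclass
class Loop:
    tag: str
    oscillation_index: float
    weight: int  # permanent importance weighting; higher is more important
    disposition: str = "open"


def flag_loops(loops, limit=OSCILLATION_LIMIT):
    """Diagnostic logic: flag loops over the limit that are not already deferred."""
    return [
        loop for loop in loops
        if loop.oscillation_index > limit and loop.disposition != "deferred"
    ]


def prioritize(flagged):
    """Order flagged loops by weight, then by severity of oscillation."""
    return sorted(flagged, key=lambda loop: (-loop.weight, -loop.oscillation_index))


def notification(flagged):
    """Build the text of the alert sent to the responsible individual."""
    lines = [f"{len(flagged)} control loop(s) require attention:"]
    lines += [
        f"  {loop.tag}: oscillation index {loop.oscillation_index:.2f}, "
        f"weight {loop.weight}"
        for loop in flagged
    ]
    return "\n".join(lines)


loops = [
    Loop("FIC101", 0.82, weight=3),
    Loop("TIC205", 0.35, weight=5),
    Loop("LIC310", 0.71, weight=5),
    Loop("PIC402", 0.90, weight=1, disposition="deferred"),
]

queue = prioritize(flag_loops(loops))
message = notification(queue)
print(message)
```

PIC402 has the worst oscillation index but does not appear
in the alert, because its disposition shows that the problem
has already been reviewed and deferred. This is the tracking
behavior that lets the responsible individual focus on new
problems.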

Present

The presentation layer refers to the user interface seen by
the vast majority of individuals working with the plant-wide
performance monitoring system. These interfaces are now
web portals in virtually every case. Key characteristics
of these web portals and their associated web parts include
the following:

1. Compatible with common web browser technology:
virtually every modern enterprise application uses
a web portal as the primary interface because of its ease
of deployment and low overall cost of ownership. Most
industrial clients will not even consider enterprise-wide
applications that come with “thick” desktop clients.

2. Easy to install and configure with “off-the-shelf”
technology: industrial users of monitoring technology
are moving more and more to standardized
technology for their web interfaces. Once again, this
is primarily driven by a need to control the cost and
complexity of managing a growing number of large-scale
applications. Custom interfaces that require specialized
training to configure and maintain are seen as
a major disadvantage.

3. Logical structure and easy to navigate: users have
come to expect intuitive navigation and functionality.
The old adage “if all else fails, read the instructions” is
now an accepted fact of interface design. An example
interface is provided in Figure 11.30. This interface
uses a plant hierarchy in the lower left corner to guide
the user. Clicking on a node of the hierarchy brings
up a report that can be toggled between an aggregate
report showing overall control performance metrics
for the area and a list of all assets in that area. Default
filters can be applied to the list to highlight assets with
specific control problems (shown).

4. Capable of visualizing large amounts of data: tabular
displays with filtering and sorting capabilities are
necessary but not sufficient for adequately visualizing
large amounts of data. Other graphical approaches that
take advantage of size, color, shape, and texture are
needed to allow quick visual data exploration. Figure
11.28 shows a treemap of control performance metrics
for approximately 1700 controllers. Each square
represents a controller. The squares are collected into
common units and areas. In this example, the size of each
square is proportional to the weight or importance of
the controller, and the color indicates the
service factor (green for an acceptable service factor, red
for an unacceptably low service factor). The user’s eye is
naturally drawn to the large red boxes that represent
the important controllers with a low service factor.

5. Can be customized to fit individual requirements: as
mentioned earlier, control performance monitoring is
a passive technology and can only add value when it
is integrated into the site work processes. Therefore, it
is critical that the interface can be easily customized
to fit the varied requirements of the different
individuals responsible for leveraging these tools to drive
their business decisions.

6. Security model: modern web interfaces must be
designed with a sophisticated multilayered security
model. At a minimum, this model will apply global
access restrictions to keep out unauthorized users such
as hackers, and internal access restrictions that
make information for specific assets or hierarchy levels
available only to specific authorized users.
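The internal access restrictions described in item 6 can be
illustrated with a small sketch in which each user is granted
one or more nodes of the plant hierarchy and may view any
asset at or below a granted node. The user names, hierarchy
paths, and function names are hypothetical, and the global
layer (authentication, network security) is outside the
scope of the example.

```python
def can_view(granted_nodes, asset_path):
    """True if a granted node is the asset itself or one of its ancestors."""
    parts = asset_path.split("/")
    ancestors = {"/".join(parts[:n]) for n in range(1, len(parts) + 1)}
    return any(node in ancestors for node in granted_nodes)


# Hypothetical grants: user -> hierarchy nodes the user may see.
GRANTS = {
    "alice": {"Mill"},                         # site-wide access
    "bob": {"Mill/Pulping"},                   # one area
    "carol": {"Mill/Paper Machine 2/Dryers"},  # one unit
}

ASSETS = [
    "Mill/Pulping/Digester/FIC101",
    "Mill/Paper Machine 2/Dryers/PIC402",
    "Mill/Paper Machine 2/Wet End/LIC310",
]


def visible_assets(user):
    """Filter the asset list down to what the user is authorized to see."""
    return [a for a in ASSETS if can_view(GRANTS.get(user, set()), a)]


for user in ("alice", "bob", "carol", "mallory"):
    print(user, visible_assets(user))
```

Comparing whole path segments rather than raw string prefixes
avoids granting access to a sibling node whose name merely
begins with the same characters.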